Pruning for Monotone Classification Trees
Authors
Abstract
In classification problems with ordinal attributes, the class attribute should very often increase with some or all of the explanatory attributes. These are called classification problems with monotonicity constraints. Standard classification tree algorithms such as CART or C4.5 are not guaranteed to produce monotone trees, even if the data set is completely monotone. We look at pruning-based methods to build monotone classification trees from monotone as well as non-monotone data sets. We develop a number of fixing methods that make a non-monotone tree monotone through additional pruning steps. These fixing methods can be combined with existing pruning techniques to obtain a sequence of monotone trees. The performance of the new algorithms is evaluated through experimental studies on artificial as well as real-life data sets. We conclude that the monotone trees have a slightly better predictive performance and are considerably smaller than trees constructed by the standard algorithm.
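To make the monotonicity constraint concrete, the following is a minimal sketch of how one might test whether a labelled data set is monotone. It assumes the usual pairwise reading of the constraint: whenever one example is less than or equal to another on every attribute, its class label should not be higher. The function names `dominates` and `monotonicity_violations` and the toy data are hypothetical illustrations and are not taken from the paper.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# An example is a pair (attribute vector, ordinal class label).
Example = Tuple[Sequence[float], int]


def dominates(x: Sequence[float], y: Sequence[float]) -> bool:
    """Return True when x <= y on every attribute."""
    return all(a <= b for a, b in zip(x, y))


def monotonicity_violations(data: List[Example]) -> List[Tuple[int, int]]:
    """List index pairs (low, high) where example `low` is <= example `high`
    on every attribute but carries a strictly higher class label.

    Assumption: this pairwise check is one common way to define a monotone
    data set; an empty result means the data set is monotone under it.
    """
    violations = []
    for (i, (xi, ci)), (j, (xj, cj)) in combinations(enumerate(data), 2):
        if dominates(xi, xj) and ci > cj:
            violations.append((i, j))
        elif dominates(xj, xi) and cj > ci:
            violations.append((j, i))
    return violations


# Hypothetical toy data: two ordinal attributes, binary class label.
monotone_data = [((1, 1), 0), ((2, 2), 1), ((3, 1), 0)]
noisy_data = monotone_data + [((1, 0), 1)]

print(monotonicity_violations(monotone_data))
print(monotonicity_violations(noisy_data))
```

The check compares every pair of examples, so its cost grows quadratically with the number of examples. That is acceptable for an illustration but may need a smarter approach on large data sets.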
Related resources
Application of classification trees-J48 to model the presence of roach (Rutilus rutilus) in rivers
In the present study, classification trees (CTs-J48 algorithm) were used to study the occurrence of roach in rivers in Flanders (Belgium). The presence/absence of roach was modelled based on a set of river characteristics. The predictive performance of the CTs models was assessed based on the percentage of Correctly Classified Instances (CCI) and Cohen's kappa statistics. To find the best model...
Study of Various Decision Tree Pruning Methods with their Empirical Comparison in WEKA
Classification is important problem in data mining. Given a data set, classifier generates meaningful description for each class. Decision trees are most effective and widely used classification methods. There are several algorithms for induction of decision trees. These trees are first induced and then prune subtrees with subsequent pruning phase to improve accuracy and prevent overfitting. In...
Use of classification tree methods to study the habitat requirements of tench (Tinca tinca) (L., 1758)
Classification trees (J48) were induced to predict the habitat requirements of tench (Tinca tinca). 306 datasets were used for the given fish during 8 years in the river basins in Flanders (Belgium). The input variables consisted of the structural-habitat (width, depth, gradient slope and distance from the source) and physic chemical (pH, dissolved oxygen, water temperature and electric conduct...
J-measure Based Hybrid Pruning for Complexity Reduction in Classification Rules
Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach that is alternative to the rule induction approach using decision trees also known as ‘divide and conquer’. Prism often achieves a similar level of classification accuracy compared with decision trees, but tends to produce a more compact noise tolerant set of classification rules. As with other...
A Trade-Off Between Depth and Impurity for Pruning Decision Trees
Most pruning methods for decision trees minimize a classification error rate. In uncertain domains, some sub-trees which do not lessen the error rate can be relevant to point out some populations of specific interest or to give a representation of a large data file. We propose here a new pruning method (called pruning) which takes into account the complexity of sub-trees and which is able to ke...